N=1:

rank | frequency | n-gram |
---|---|---|
1 | 10060 | -n |
2 | 8387 | -e |
3 | 5738 | -s |
4 | 4892 | -t |
5 | 3502 | -r |

N=2:

rank | frequency | n-gram |
---|---|---|
1 | 8342 | -en |
2 | 2451 | -er |
3 | 1667 | -de |
4 | 1623 | -ng |
5 | 1326 | -es |

N=3:

rank | frequency | n-gram |
---|---|---|
1 | 1414 | -ing |
2 | 1371 | -ten |
3 | 1109 | -ers |
4 | 1056 | -gen |
5 | 965 | -den |

N=4:

rank | frequency | n-gram |
---|---|---|
1 | 604 | -eren |
2 | 557 | -ngen |
3 | 539 | -ende |
4 | 295 | -lijk |
5 | 288 | -sche |

N=5:

rank | frequency | n-gram |
---|---|---|
1 | 482 | -ingen |
2 | 274 | -ische |
3 | 211 | -elijk |
4 | 210 | -lijke |
5 | 193 | -eerde |

The tables show the most frequent letter n-grams at the end of words for N=1…5, in that order. The computation parallels that of 2.2.5 Most frequent word beginnings; here the aim is to detect suffixes rather than prefixes.
For N=3:
SELECT @pos := (@pos + 1) AS pos, xx.*
FROM (SELECT @pos := 0) r,
     (SELECT COUNT(*) AS cnt, CONCAT('-', RIGHT(word, 3)) AS ngram
      FROM words
      WHERE w_id > 100
      GROUP BY RIGHT(word, 3)
      ORDER BY cnt DESC) xx
LIMIT 5;
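
The same count can be sketched outside the database for any N. The following Python snippet is a minimal illustration, not the code behind the tables: the function name `top_endings`, the toy word list, and the alphabetical tie-break are assumptions made here (the SQL query leaves the order of ties unspecified), and the `w_id>100` filter is not reproduced.

```python
from collections import Counter

def top_endings(words, n, k=5):
    """Return the k most frequent word-final letter n-grams
    as (rank, frequency, n-gram) rows."""
    # word[-n:] takes the last n letters; a word shorter than n
    # contributes the whole word.
    counts = Counter(word[-n:] for word in words)
    # Sort by descending frequency; ties are broken alphabetically (assumption).
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [
        (rank, freq, "-" + ending)
        for rank, (ending, freq) in enumerate(ranked[:k], start=1)
    ]

# Hypothetical toy word list, standing in for the contents of the words table.
words = ["lopen", "werken", "woning", "regering", "kinderen", "mogelijk", "lezing"]

for n in range(1, 6):
    print(f"N={n}:")
    for row in top_endings(words, n):
        print(row)
```

Like the query, this counts each entry of the word list once, so it ranks endings by the number of distinct listed words that carry them rather than by running-text occurrences.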
2.2.5 Most frequent word beginnings